Near-Neighbor Search in Pattern Distance Spaces

Authors

  • Haixun Wang
  • Chang-Shing Perng
  • Philip S. Yu
Abstract

In this paper, we study the near-neighbor problem based on pattern similarity, a new type of similarity that conventional distance metrics such as the Lp norm cannot model effectively. The problem is nonetheless important to many applications. For example, in DNA microarray analysis, the expression levels of two closely related genes may rise and fall under different external conditions or at different times. Although the magnitudes of their expression levels may not be close, the patterns they exhibit over time or under different conditions can be very similar. We measure the distance between two objects by pattern similarity, i.e., whether the two objects exhibit a synchronous pattern of rise and fall under different conditions. We then present an efficient algorithm for near-neighbor search based on pattern similarity, and we perform tests on several real and synthetic data sets to show its effectiveness.
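To make the contrast with the Lp norm concrete, here is a minimal Python sketch. It assumes one simple way to score pattern similarity: the spread (maximum minus minimum) of the per-condition differences between two objects, which is zero when one object is a constant shift of the other. This measure, the function names, and the brute-force scan are illustrative assumptions; the abstract does not state the paper's distance definition or describe its search algorithm.

```python
from typing import List, Sequence


def pattern_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Spread of the per-condition differences between u and v.

    Assumed, illustrative measure: it is 0 when v is u shifted by a
    constant, i.e. when both objects rise and fall in lockstep.
    """
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("objects must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def lp_distance(u: Sequence[float], v: Sequence[float], p: float = 2.0) -> float:
    """Conventional Lp norm of the difference, shown for comparison."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def near_neighbors(query: Sequence[float],
                   data: List[Sequence[float]],
                   radius: float) -> List[int]:
    """Brute-force scan: indices of objects within `radius` of the query."""
    return [i for i, obj in enumerate(data)
            if pattern_distance(query, obj) <= radius]


# Hypothetical expression levels of three genes under four conditions.
genes = [
    [1, 5, 2, 8],      # gene A
    [11, 15, 12, 18],  # gene B: same pattern as A, shifted up by 10
    [8, 2, 5, 1],      # gene C: roughly the opposite pattern
]

matches = near_neighbors(genes[0], genes, radius=1.0)
```

In this toy data, gene B is far from gene A under the L2 norm yet has pattern distance 0 under the assumed measure, so the scan returns it as a near neighbor while excluding gene C. A linear scan like this only illustrates the query semantics; it is not an efficient search method.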


Similar resources

Classification, with Applications to Object and Shape Recognition in Image Databases

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time,...

Asymmetric learning vector quantization for efficient nearest neighbor classification in dynamic time warping spaces

The nearest neighbor method together with the dynamic time warping (DTW) distance is one of the most popular approaches in time series classification. This method suffers from high storage and computation requirements for large training sets. As a solution to both drawbacks, this article extends learning vector quantization (LVQ) from Euclidean spaces to DTW spaces. The proposed LVQ scheme uses...

A Simple Algorithm for Nearest Neighbor Search in High Dimensions

The problem of finding the closest point in high-dimensional spaces is common in pattern recognition. Unfortunately, the complexity of most existing search algorithms, such as k-d tree and R-tree, grows exponentially with dimension, making them impractical for dimensionality above 15. In nearly all applications, the closest point is of interest only if it lies within a user-specified distance e...

SIMP: Accurate and Efficient Near Neighbor Search in Very High Dimensional Spaces

Near neighbor search in very high dimensional spaces is useful in many applications. Existing techniques solve this problem efficiently only for the approximate case. These solutions are designed to solve r-near neighbor queries only for a fixed query range or a set of query ranges with probabilistic guarantees, and are then extended to nearest neighbor queries. Solutions supporting a set of query...


Publication date: 2005